%logic.2[f86,jmc]		Another try at logic for Daedalus

\input memo.tex[let,jmc]
\title{MATHEMATICAL LOGIC AND ARTIFICIAL INTELLIGENCE}
\section{Introduction}
\section{Why logic in AI}

	(McCarthy 1960) proposes the {\it Advice Taker}, a
program that would decide what to do by reasoning in logic.  The
name and a large part of the motivation came from contemplating
programs that learn from experience.  Already in the 1950s there
were programs, notably Arthur Samuel's checker program, that
learned the optimal values of numerical parameters.  Richard
Friedberg had also experimented with a program that learned to do
a simple task by modifying programs at random.  However, parameter
learning is limited in what it can learn.  There may be no
set of values of the parameters that corresponds to certain
important insights.  This was certainly true of Samuel's checker
program.  The problem with learning by randomly modifying a computer
program is that interesting behaviors have extremely low density
in the space of computer program texts.  Therefore, it was proposed
that AI follow what seems to be one major form of human learning ---
learning facts.  However, it is easier to be told facts than to learn
them from experience.  Therefore, it seemed that making programs
that learn from experience should be preceded by making programs
that can take advice in the form of facts.

	In making an {\it advice taker} program, it is important to
preserve a key feature of human advice giving and taking.  Namely,
it isn't necessary for one human to fully understand the other's mental
structure in order to give advice, whereas to modify a computer
program it is necessary to understand it in considerable detail.
This seems to be a natural feature of presenting information declaratively
rather than imperatively (as in most computer programming languages,
including LISP).  (Prolog allows more information to be presented
declaratively than any other current language, but it is still necessary
to write much of the program imperatively when the Prolog program has to
do something complicated.)

	The {\it Advice Taker} was to work as follows.

	It represents what it knows about goals, general facts about
the world and facts about the particular situation by sentences in
a suitable logical language.  It decides what to do by inferring
logically that a certain action will advance its goals and then
performing that action.  The actions were to include internal
``mental'' actions, especially the retrieval of
information from its main memory for consideration by the
inference routine.  Observations also resulted in sentences.
Even in the 1960 paper, this reliance on logic was softened
by proposing to include other data structures and programs, but
they were to be under the control of the logic in the sense
that there were sentences saying what they were and they were
activated by actions decided upon logically.
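
	As an illustration of this mode of operation, here is a small sketch
in Python of a general inference routine deciding what to do from
declaratively stated facts.  It is not any of the actual {\it Advice Taker}
programs; the predicates, the rule format and the little travel problem are
invented for the illustration.

# Toy sketch of the Advice Taker idea: what the program knows is held as
# declarative sentences (tuples), and one general inference routine, rather
# than special-purpose code, concludes what action to take.  The predicates
# ("at", "walkable", "want", ...) and the travel problem are invented here.

def match(pattern, fact, env):
    """Match a pattern against a fact, extending the variable bindings env.
    Strings beginning with '?' are variables.  Returns the new env or None."""
    if len(pattern) != len(fact):
        return None
    env = dict(env)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if p in env and env[p] != f:
                return None
            env[p] = f
        elif p != f:
            return None
    return env

def satisfy(premises, facts, env):
    """Yield every binding environment that makes all the premises true."""
    if not premises:
        yield env
        return
    first, rest = premises[0], premises[1:]
    for fact in facts:
        env2 = match(first, fact, env)
        if env2 is not None:
            yield from satisfy(rest, facts, env2)

def substitute(pattern, env):
    return tuple(env.get(t, t) for t in pattern)

def forward_chain(facts, rules):
    """Apply the rules to the facts until no new sentences can be inferred."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for env in satisfy(premises, facts, {}):
                c = substitute(conclusion, env)
                if c not in facts:
                    new.add(c)
        if not new:
            return facts
        facts |= new

FACTS = [
    ("at", "I", "desk"),
    ("walkable", "desk", "airport"),
    ("want", "at", "I", "airport"),
]

RULES = [
    # If I am at x and can walk from x to y, then walking to y is feasible.
    ((("at", "I", "?x"), ("walkable", "?x", "?y")),
     ("feasible", "walk-to", "?y")),
    # If a feasible action would put me where I want to be, I should do it.
    ((("feasible", "walk-to", "?y"), ("want", "at", "I", "?y")),
     ("should-do", "walk-to", "?y")),
]

for sentence in forward_chain(FACTS, RULES):
    if sentence[0] == "should-do":
        print("decided action:", sentence)

The point of the sketch is that the knowledge is in the data, so advice can
be given by adding a sentence rather than by rewriting the program.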

	Fisher Black (1964) did the first {\it Advice Taker}, but
neither it nor any more recent attempt has been fully successful.
The programs are slow, narrowly specialized, and don't think about
their own methods.  Nevertheless, there has been important progress
with the {\it Advice Taker} approach in the 30 years since it was
first proposed.

	The popularity of this way of doing AI has fluctuated, and the
present is a time of considerable activity, because some problems have
been solved and other approaches have encountered difficulties.  There are
also modified approaches that include some of the content of the {\it
Advice Taker} but represent more of the information imperatively, i.e. put
more in the program and less in the data.  The main content of this paper
is an account of both the difficulties and the progress.  We shall also
discuss the advantages and disadvantages of the modified approaches.

\section{Situation Calculus}
%
	One of the most important areas of application of logic in AI
has been {\it planning}.  The term {\it planning} in AI refers to
determining from the facts available a sequence of actions that
will achieve a goal.  [The planning formalisms used in operations
research may be regarded as a special case wherein the actions
to be taken are predetermined and only their order is to be
determined.]  For planning the key facts are those that describe
the consequences of actions.

	This led in the 1960s to the {\it situation calculus}
[McCarthy and Hayes 1969], which was intended to provide a way of
expressing the consequences of actions independent of the problem.

	The basic formalism of the situation calculus is
%
$$s' = result(e,s),$$
%
which asserts that $s'$ is the situation that results when event $e$
occurs in situation $s$.  Here are some situation calculus axioms for
moving and painting blocks.
%
\medskip
{\overfullrule=0pt
\centerline{Qualified result-of-action axioms}
%$$\displaylines{∀x l s.clear(top(x),s) ∧ clear(l,s) ∧ ¬tooheavy(x)\hfill\cr
%\hfill ⊃ loc(x,result(move(x,l),s)) = l\cr}$$
$$∀x l s.clear(top(x),s) ∧ clear(l,s) ∧ ¬tooheavy(x)
⊃ loc(x,result(move(x,l),s)) = l$$
$$∀x c s.color(x,result(paint(x,c),s)) = c.$$
\medskip
\centerline{Frame axioms}
$$∀x y l s.color(y,result(move(x,l),s)) = color(y,s).$$
$$∀x y l s.y ≠ x ⊃ loc(y,result(move(x,l),s)) = loc(y,s).$$
$$∀x y c s.loc(x,result(paint(y,c),s)) = loc(x,s).$$
$$∀x y c s.y ≠ x ⊃ color(x,result(paint(y,c),s)) = color(x,s).$$
\medskip}
	Notice that all qualifications to the
performance of the actions are explicit in the premisses and that
statements (called frame axioms)
about what doesn't change when an action is performed
are explicitly included.  Without those statements it wouldn't be
possible to infer much about $result(e2,result(e1,s))$, since we
wouldn't know whether the premisses for the event $e2$ to have its
expected result were fulfilled in $result(e1,s)$.

	Notice further that the situation calculus applies only when it
is reasonable to treat change as a sequence of discrete events, each of which
results in a new total situation.  Continuous events and concurrent events
are not covered.
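
	The effect of these axioms can be imitated directly in a program.
The following sketch in Python is only an illustration (the state layout,
the function names and the single qualification test are invented for it,
and it simply leaves the situation unchanged when a qualification fails,
a case about which the axioms say nothing): a situation records the location
and color of every block, and the frame axioms show up as the copying of
everything the event does not mention.

# A sketch of the blocks situation calculus: a situation is a record of each
# block's location and color, and result(e, s) produces the new total
# situation.  The representation and the qualification test are invented for
# this illustration; the axioms in the text are the real formalism.

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Situation:
    loc: tuple     # pairs (block, location)
    color: tuple   # pairs (block, color)

def lookup(pairs, key):
    return dict(pairs)[key]

def result(event, s):
    """Compute the situation resulting from one discrete event.  Fluents the
    event does not mention are copied unchanged (the frame axioms)."""
    if event[0] == "move":
        _, x, l = event
        # qualification: move succeeds only if the destination is clear
        # (clear(top(x)) and tooheavy are omitted to keep the sketch short);
        # if it fails we just return s, though the axioms then say nothing.
        if all(loc != l for _, loc in s.loc):
            return replace(s, loc=tuple((b, l if b == x else loc)
                                        for b, loc in s.loc))
        return s
    if event[0] == "paint":
        _, x, c = event
        return replace(s, color=tuple((b, c if b == x else col)
                                      for b, col in s.color))
    return s

s0 = Situation(loc=(("A", "table1"), ("B", "table2")),
               color=(("A", "red"), ("B", "blue")))
s2 = result(("paint", "A", "green"), result(("move", "A", "shelf"), s0))
print(lookup(s2.loc, "A"), lookup(s2.color, "A"))   # shelf green
print(lookup(s2.loc, "B"), lookup(s2.color, "B"))   # table2 blue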

	Unfortunately, it wasn't very feasible to use the situation calculus 
in the manner proposed, even for problems meeting its restrictions.
  In the first place, using general
purpose theorem provers made the programs run too slowly, since
the theorem provers of 1969 [Green 1969] had no way of controlling
the search.  This led to STRIPS [Fikes and Nilsson 1971], which
reduced the use of logic to reasoning within a single situation.  However,
the STRIPS formalizations were much more specialized than the full situation calculus.
The facts that were included in the axioms had to be delicately chosen
in order to avoid the introduction of contradictions arising from
the failure to delete a sentence that wouldn't be true in the
situation that resulted from an action.
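
	The contrast can be made concrete with a small sketch in Python of a
STRIPS-style operator (the operator format is invented for the illustration).
The current situation is a single set of sentences, and an action is applied
by deleting and adding sentences rather than by deduction about
$result(e,s)$; the fragility mentioned above shows up if a sentence is left
off the delete list.

# A sketch of a STRIPS-style operator: one current set of sentences is
# updated by a delete list and an add list instead of reasoning about
# result(e, s).  The operator format is invented for this illustration.

def apply_op(state, precond, delete, add):
    """Apply an operator if its preconditions all hold in the current state."""
    if not precond <= state:
        return state                      # preconditions unsatisfied
    return (state - delete) | add

state = {("at", "A", "table"), ("clear", "shelf"), ("color", "A", "red")}

# move block A from the table to the shelf
new_state = apply_op(
    state,
    precond={("at", "A", "table"), ("clear", "shelf")},
    delete={("at", "A", "table"), ("clear", "shelf")},
    add={("at", "A", "shelf"), ("clear", "table")},
)
print(new_state)
# Had ("at", "A", "table") been left off the delete list, the new state would
# say that A is both on the table and on the shelf, just the kind of
# contradiction the text says had to be avoided by delicate choice of facts.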
%
\section{History}
\section{Planning}
\section{Non-monotonic reasoning}

	The requirements of science have often led to new branches
of mathematics.  Computer science is no exception.  In the early
1970s the concept of NP-completeness arose and led to the now
classical $P=NP$ problem --- as sharply posed and apparently as
difficult as any classical mathematical problem.  In the late
1970s, formalized non-monotonic reasoning arose from AI, although
hints of it and special cases had appeared earlier.

	All previous systems of mathematical logic are monotonic
in the sense that if a conclusion $p$ follows from premisses $A$,
and $B$ is a more inclusive set of premisses, then $p$ follows
from $B$.  In the notation of proof theory: if $A\vdash p$ and
$A ⊂ B$, then $B\vdash p$.
Indeed a proof of $p$ from the premisses $A$ is a sequence of sentences
ending with $p$, each of which is either a premiss, an axiom or follows
from a subset of the sentences occurring earlier in the proof by one of
the rules of inference.  Therefore, a proof of $p$ from $A$ can also serve as a
proof from $B$.  The semantic notion of entailment is also monotonic; we
say that $A$ entails $p$ (written $A\models p$) if $p$ is true in all
models of $A$.  But if $A\models p$ and $A ⊂ B$, then every model of $B$
is also a model of $A$, which shows that $B\models p$.

	Human common sense reasoning is often non-monotonic in that we
reach conclusions in the absence of a certain premiss that we would not
reach if the premiss were present.  For example, if I hire you to build a
cage for my bird you will plan to put a roof on it to prevent the bird
from flying away but not if I add that my bird is a penguin.  Of course,
the phenomenon of non-monotonicity was noticed long ago, but it was
swept under the rug in various ways, e.g. by speaking of suppressed
premisses or by invoking probabilities.  However, when AI began to
require computer programs to carry out non-monotonic reasoning, then
it became necessary to formalize it properly.

	Several systems of non-monotonic reasoning have been devised,
and as more common sense reasoning phenomena are investigated, more
systems arise.  I shall mention only the circumscription formalism
of (McCarthy 1980, 1986).  This amounts to putting an
ordering on models of a theory and drawing conclusions that are
true in models minimal in the ordering.  Such inferences are
in general non-monotonic, because when a new premiss is added
to the theory the interpretation that was previously a minimal
model may no longer be a model at all.  Mathematically, such
methods of non-monotonic reasoning amount to the minimization
of logical expressions under constraints.  It is interesting that
such mathematical problems arise from common sense reasoning, because
many of the laws of physics are also conveniently expressed as
minimum principles.
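
	The minimization can be written down as a second-order formula.  As a
rough sketch (the exact formulations in the cited papers differ in detail),
the circumscription of a predicate $P$ in an axiom $A(P,Z)$, where $Z$ stands
for the remaining predicates and functions that are allowed to vary, is
%
$$A(P,Z) ∧ ∀p z.(A(p,z) ⊃ ¬(p < P)),$$
%
where $p < P$ abbreviates $∀x.(p(x) ⊃ P(x)) ∧ ¬∀x.(P(x) ⊃ p(x))$.  The second
conjunct says that no predicate with a strictly smaller extension than $P$
satisfies $A$, so the models of the circumscription are just the models of
$A$ in which the extension of $P$ is minimal.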

	Here are the axioms for moving and painting blocks rewritten
with abnormality ($ab$) predicates, taken from [McCarthy 1986].

{\centerline{\bf Moving and painting blocks using circumscription}
\overfullrule=0pt
\bigskip
\centerline{Axioms about locations and the effects of moving objects}
$$∀x e s.¬ab(aspect1(x,e,s)) ⊃ loc(x,result(e,s)) = loc(x,s)$$
$$∀x l s.ab(aspect1(x,move(x,l),s))$$
$$∀x l s.¬ab(aspect3(x,l,s)) ⊃ loc(x,result(move(x,l),s)) = l$$
\bigskip
\centerline{Axioms about colors and painting}
$$∀x e s.¬ab(aspect2(x,e,s)) ⊃ color(x,result(e,s)) = color(x,s)$$
$$∀x c s.ab(aspect2(x,paint(x,c),s))$$
$$∀x c s.¬ab(aspect4(x,c,s)) ⊃ color(x,result(paint(x,c),s)) = c$$}
\medskip
This treats the qualification problem, because any number
of conditions that may be imagined as preventing moving or painting
can be added later and asserted to imply the corresponding $ab\ aspect\ldots$.
It treats the frame problem in that we don't have to say
that moving doesn't affect colors or that painting doesn't affect locations.
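
	The minimization can also be carried out mechanically on a tiny
instance.  The following sketch in Python (the encoding is invented for the
illustration) enumerates the interpretations relevant to a single event
$move(B,L)$, keeps those satisfying the three location axioms above, and then
keeps those in which the set of true $ab$ atoms is minimal; in every minimal
model the block ends up at $L$, which is the circumscriptive conclusion.

# A sketch of circumscription as minimization on a tiny instance: one block B
# at location L0 and the single event move(B, L).  We enumerate candidate
# interpretations, keep those satisfying the location axioms in the text, and
# then keep those whose set of true ab atoms is minimal under inclusion.
# The encoding is invented for this illustration.

from itertools import product

L0, L = "table", "shelf"

models = []
# ab1 interprets ab(aspect1(B, move(B,L), S0)); ab3 interprets ab(aspect3(B, L, S0));
# newloc interprets loc(B, result(move(B,L), S0)).
for ab1, ab3, newloc in product([False, True], [False, True], [L0, L]):
    ok = True
    if not ab1 and newloc != L0:   # ¬ab(aspect1(x,e,s)) ⊃ loc(x,result(e,s)) = loc(x,s)
        ok = False
    if not ab1:                    # ab(aspect1(x,move(x,l),s))
        ok = False
    if not ab3 and newloc != L:    # ¬ab(aspect3(x,l,s)) ⊃ loc(x,result(move(x,l),s)) = l
        ok = False
    if ok:
        abs_true = frozenset(name for name, v in [("ab1", ab1), ("ab3", ab3)] if v)
        models.append((abs_true, newloc))

# keep the models whose ab extension is minimal under set inclusion
minimal = [m for m in models if not any(other[0] < m[0] for other in models)]
print("all models:    ", models)
print("minimal models:", minimal)
# Every minimal model has newloc == "shelf": minimizing ab yields the
# expected result of the move without listing what the move does not change.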
\section{Problems}
\section{Controversies}
\section{References}
\centerline{This draft of logic.2[f86,jmc] TEXed on \jmcdate\ at \theTime}
\centerline{Copyright \copyright\ \number\year\ by John McCarthy}
\vfill\eject\end
Outline
Introduction
\section{Why logic in AI}
\section{History}
\section{Planning}
\section{Non-monotonic reasoning}
\section{Problems}
\section{Controversies}
references
notes:

	The most straightforward way to make a machine think is to make it
start from a representation of what it already knows, expressed by sentences in
mathematical logic, and from this to infer logically what it needs to know.
The idea can be said to go back to Leibniz and is also present in the
work of Boole and Frege.  It has proved very difficult, but progress has
been made.  The object of this paper is to describe the difficulties,
the progress and the prospects.

[The converse idea is to look at an animal, human or machine that solves
problems as representing its beliefs in its state.  One systematically ascribes
beliefs and goals to its various states and interprets certain of its
processes as logically inferring the beliefs ascribed to the states that
result from these processes.  Often this ascription of mental qualities
and the principle of rationality --- {\it it will do what it believes will
achieve its goals} --- provide the best way of using what can be observed
about the entity to predict its likely response to stimuli.  Indeed this
is probably why we ascribe minds to each other.  Allen Newell (1980), who
doesn't advocate the direct use of logic, calls this the ``logic level''.]

	Making computer programs that think using logic imposes several
requirements.

	1. Deciding how to represent what the machine is to think about
by sentences of logic, i.e. choosing a particular set of predicates and
functions.  There is an enormous space of choices that can be used for
representing the facts about any interesting phenomenon.  The desiderata
are the following.

		a. The facts about the phenomenon are representable.
This is called physical adequacy of the representation.  It is needed
mainly for general reasoning about the phenomenon.

		b. What the program can know about the facts is
representable.  This is called epistemological adequacy.  The difference
is analogous to representing the state of a gas sample by the positions and
velocities of the molecules vs. representing it by temperature and
pressure.

		c. What the program is expected to learn corresponds
to incremental changes in the representation.